This report analyzes cases and deaths attributed to COVID-19 since January 2020.
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
Variables:
- iso_code: ISO 3166-1 alpha-3 three-letter country code.
- continent: continent of the country.
- location: name of the state or federal entity.
- date: date of the observation (year-month).
- cases: new confirmed cases of COVID-19 per 1M people.
- deaths: new deaths attributed to COVID-19 per 1M people.

df = pd.read_csv('./cases_deaths1.csv', header = 0)
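The `date` column holds labels such as `21-Dec`, which sort alphabetically rather than chronologically. A minimal sketch of converting them to real timestamps follows. It assumes the labels are a two-digit year followed by an abbreviated English month name. The `sample` frame and the `month` column are hypothetical and are not part of the original CSV.

```python
import pandas as pd

# Hypothetical sample using the same 'date' labels as the report's tables
sample = pd.DataFrame({
    "location": ["Argentina", "Argentina", "Argentina"],
    "date": ["22-Jan", "21-Nov", "21-Dec"],
    "cases": [24968.5, 919.6, 7096.9],
})

# Assumption: labels follow the pattern YY-Mon (e.g. '21-Dec' is December 2021)
sample["month"] = pd.to_datetime(sample["date"], format="%y-%b")

# Sort chronologically so time-based plots and animations run in order
sample = sample.sort_values(["location", "month"]).reset_index(drop=True)
print(sample[["date", "month", "cases"]])
```

Keeping the original `date` label alongside the parsed `month` lets the plots keep their current axis text while the rows are ordered by time.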
Table 1. Cases and deaths from COVID-19.
df.tail(10)
| | iso_code | continent | location | date | cases | deaths |
|---|---|---|---|---|---|---|
| 140 | ITA | Europe | Italy | 21-Dec | 18174.3 | 59.2 |
| 141 | JPN | Asia | Japan | 21-Dec | 44.9 | 0.3 |
| 142 | GBR | Europe | United Kingdom | 21-Dec | 39855.7 | 53.7 |
| 143 | USA | North America | United States | 21-Dec | 18599.5 | 134.8 |
| 144 | ARG | South America | Argentina | 22-Jan | 24968.5 | 14.0 |
| 145 | AUS | Oceania | Australia | 22-Jan | 43586.3 | 12.6 |
| 146 | ITA | Europe | Italy | 22-Jan | 33626.8 | 46.2 |
| 147 | JPN | Asia | Japan | 22-Jan | 604.6 | 0.2 |
| 148 | GBR | Europe | United Kingdom | 22-Jan | 29878.9 | 40.0 |
| 149 | USA | North America | United States | 22-Jan | 27966.9 | 60.2 |
px.scatter(
    data_frame=df,
    x='cases',
    y='deaths',
    size='deaths',
    color='location',
    title='Cases per million vs deaths per million',
    labels={'cases': 'cases per million', 'deaths': 'deaths per million'},
    # Scales
    range_x=[-1000, 50e+03],
    range_y=[-100, 500],
    # Hover over points
    hover_name='location',
    text='iso_code',
    # Animation based on date
    animation_frame='date',
    size_max=100,
)
Figure. Animated scatter plot: cases per million vs deaths per million attributed to COVID-19 in different countries. Circle size is based on deaths per million. Color identifies the country.
The animation starts with cases below 20k per million but many deaths attributed to COVID-19. By the end of 2021 and the beginning of 2022, deaths (y-axis) are mostly below 100 per million and near 0 in some countries, even though cases are high. This pattern is consistent with the high contagiousness of the circulating Omicron variant, which caused many cases, and with the effectiveness of vaccines in significantly reducing deaths. COVID-19 also appears first in the European and North American countries. In addition, countries in opposite hemispheres behave differently in their coldest and warmest months. For example, Argentina shows a peak in cases in May - June 2021, while northern countries such as the United Kingdom and the United States show one in December 2020 and January 2021.
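Peak months like those described above can also be read from the data rather than from the animation. The sketch below uses a hypothetical helper, `peak_month`, on a few rows copied from the report's tables. It assumes the same column names as `df`. With only these rows the result reflects the sample, not the full series.

```python
import pandas as pd

# A few rows copied from the report's tables (not the full dataset)
sample = pd.DataFrame({
    "location": ["Argentina", "Argentina", "Argentina", "Italy", "Italy"],
    "date": ["21-May", "21-Jun", "22-Jan", "21-Dec", "22-Jan"],
    "cases": [17638.6, 15098.7, 24968.5, 18174.3, 33626.8],
    "deaths": [312.0, 355.5, 14.0, 59.2, 46.2],
})

def peak_month(data, column):
    """Return, per location, the 'date' label of the row where `column` is largest."""
    # idxmax gives the index label of the maximum within each location
    idx = data.groupby("location")[column].idxmax()
    return data.loc[idx].set_index("location")["date"]

print(peak_month(sample, "cases"))
print(peak_month(sample, "deaths"))
```

Calling `peak_month(df, "cases")` on the full frame would give one peak per country. To compare individual waves, the frame would first need to be restricted to the period of interest.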
px.area(
    data_frame=df,
    x='date',
    y='cases',
    facet_col='location',
    facet_col_wrap=2,
    color='location',
    title='Cases per million',
)
Figure. Faceted area plot: date vs cases per million due to COVID-19 in different countries.
This plot shows the different waves of COVID-19 through time.
px.area(
    data_frame=df,
    x='date',
    y='deaths',
    facet_col='location',
    facet_col_wrap=2,
    color='location',
    title='Deaths per million',
)
Figure. Faceted area plot: date vs deaths per million attributed to COVID-19 in different countries.
Here we can observe the effectiveness of vaccines in the most recent months.
Table 2. Data from Argentina.
# Boolean indexing keeps only the Argentina rows and preserves column dtypes
data_ARG = df[df.location == 'Argentina']
data_ARG.tail(10)
| | iso_code | continent | location | date | cases | deaths |
|---|---|---|---|---|---|---|
| 90 | ARG | South America | Argentina | 21-Apr | 13782.1 | 175.6 |
| 96 | ARG | South America | Argentina | 21-May | 17638.6 | 312.0 |
| 102 | ARG | South America | Argentina | 21-Jun | 15098.7 | 355.5 |
| 108 | ARG | South America | Argentina | 21-Jul | 10073.1 | 250.3 |
| 114 | ARG | South America | Argentina | 21-Aug | 5610.2 | 133.6 |
| 120 | ARG | South America | Argentina | 21-Sep | 2019.8 | 11.6 |
| 126 | ARG | South America | Argentina | 21-Oct | 699.6 | 16.9 |
| 132 | ARG | South America | Argentina | 21-Nov | 919.6 | 14.0 |
| 138 | ARG | South America | Argentina | 21-Dec | 7096.9 | 12.7 |
| 144 | ARG | South America | Argentina | 22-Jan | 24968.5 | 14.0 |
# Figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# Traces
fig.add_trace(
    go.Bar(x=data_ARG['date'],
           y=data_ARG['cases'],
           name="cases per million"),
    secondary_y=False,
)
fig.add_trace(
    # go.Scatter with mode="lines" replaces the deprecated go.Line
    go.Scatter(x=data_ARG['date'],
               y=data_ARG['deaths'],
               mode="lines",
               name="deaths per million"),
    secondary_y=True,
)
# Title
fig.update_layout(
    title_text="Argentina"
)
# x-axis title
fig.update_xaxes(title_text="date")
# y-axes titles
fig.update_yaxes(title_text="cases per million", secondary_y=False)
fig.update_yaxes(title_text="deaths per million", secondary_y=True)
fig.show()
Figure. Multiplot: date vs cases per million and deaths per million in Argentina.
Here we observe when the different COVID-19 waves occurred in Argentina: the first in October 2020 and the second in May - June 2021, both with a high number of deaths. The third wave is under way in January 2022, but the number of deaths is clearly low this time, thanks to the effectiveness of the vaccines.
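The contrast between waves can be made more concrete with a rough ratio of deaths to cases. The sketch below uses four Argentina rows copied from Table 2. The column name `deaths_per_1000_cases` is hypothetical. The ratio is crude: it divides same-month figures, ignores the lag between infection and death, and ignores changes in testing, so it illustrates the pattern without establishing its cause.

```python
import pandas as pd

# Argentina rows copied from Table 2 (values per million people)
arg = pd.DataFrame({
    "date": ["21-May", "21-Jun", "21-Dec", "22-Jan"],
    "cases": [17638.6, 15098.7, 7096.9, 24968.5],
    "deaths": [312.0, 355.5, 12.7, 14.0],
})

# Crude same-month ratio: deaths per 1,000 confirmed cases
arg["deaths_per_1000_cases"] = 1000 * arg["deaths"] / arg["cases"]

print(arg.round(2))
```

In this sample the ratio for January 2022 is far lower than for May and June 2021, which matches the reading of the figure above.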
px.area(
    data_frame=data_ARG,
    x='date',
    y='deaths',
    labels={'deaths': 'deaths per million'},
    title='Deaths per million in Argentina',
)
Figure. Area plot: date vs deaths per million in Argentina.
px.scatter(
    data_frame=data_ARG,
    x='date',
    y='cases',
    size='deaths',
    color='deaths',
    title='Cases per million and deaths per million in Argentina',
    labels={'cases': 'cases per million', 'deaths': 'deaths per million'},
    # Hover over points
    hover_name='deaths',
    text='deaths',
    color_continuous_scale='sunset',
    size_max=100,
)
Figure. Scatter plot: date vs cases per million in Argentina. Circle size and color are based on deaths per million, and the text inside each circle shows its value.